METHOD FOR INSERTING AUXILIARY DATA IN AN AUDIO DATA STREAM



Reference to Related Applications 

This application is a continuation of International Application No. PCT/GB99/02473,
whose international filing date is July 29, 1999, which in turn claims the benefit of Great
Britain Application No. 9816518.6, filed July 29, 1998, the disclosures of which
Applications are incorporated by reference herein. The benefit of the filing priority dates
of the International and Great Britain Applications is respectfully requested.

The present invention relates to embedding of data or synchronisation signals in
another data stream. The invention is particularly concerned with inserting information
into a data stream which has been or is intended to be coded, particularly compressed,
a particular example being from a linear digital format such as PCM to an MPEG (or
similar) audio bitstream. Details of MPEG audio coding are defined in ISO/IEC
standards IS 11172-3 and IS 13818-3.

WO-A-98/33284, the disclosure of which is incorporated herein by reference, describes
a method of audio signal processing in which auxiliary data is communicated with a
decoded audio signal to assist in subsequent re-encoding of the audio signal. Several
methods of communicating the data are disclosed; however, the inventor has found that
there is room for improvement of the methods disclosed in that application.



The inventor has appreciated that another application in which it would be useful to
carry additional data with an audio bitstream is to establish frame boundaries and
synchronisation with a previously coded signal. In particular, WO-A-99/04572,
incorporated herein by reference, discloses a method of re-encoding a previously coded
signal in which the signal is analysed to determine previous coding characteristics. The
inventor has appreciated that if some form of synchronisation information were
embedded in the signal, the analysis could be simplified.

There has been discussion of carrying additional data in an audio data signal, for
example to carry surround sound information, by inserting the data so as to be nearly
imperceptible; proposals of this kind however generally involve complex proprietary
signal processing and are not designed to accommodate further coding of the signal.


The invention aims to provide a method of communicating data or synchronisation 
information together with a main data signal without unduly affecting the transmission of 
the main data signal. 

In a first aspect, the invention provides a method of inserting auxiliary digital data in a
main digital data stream which is subsequently to be coded to produce a coded data
stream (or which has been decoded from a coded data stream), the method comprising
identifying at least one component of the main data stream which will make substantially
no contribution to the coded data stream (or which was not present in the coded data
stream) and inserting data from the auxiliary data stream in the or each component.

In this way, the eventual coded data stream will be substantially unaffected by the
insertion of the auxiliary data, so there will be no overall degradation or distortion
introduced by the extra data. However, the auxiliary data will have been carried "for
free" with the main data signal until it reaches the coder. Although the invention will
normally be employed in conjunction with data which is to be coded subsequently (in
which case the auxiliary data may be removed at or around the time of coding), the
invention may be employed with data which has previously been coded but is not
necessarily to be coded further; this still provides the advantage that the carrying of
additional information cannot degrade the data further as no "real" information is
overwritten by the auxiliary data.

A further potential advantage is that, because the insertion of data is based on the
principles used in coding, components can be shared between the data insertion
apparatus and a coder or decoder, particularly when integrated as a unit including a
data insertion function and a coding or decoding function, rather than requiring bespoke
design. The auxiliary data may be carried further with the coded data stream, but no
longer embedded in the main data stream. For example, in the case of coded audio, the
coded data format may allow the auxiliary data to be carried directly as data in addition
to the coded audio. The auxiliary data is preferably used to assist in choosing coding
decisions or in synchronising the coder with a previous coder. The main data signal is
preferably an audio signal, but may be a video or other signal.

Whilst the invention is primarily concerned with adding information to a digital main data
signal, it is to be appreciated that this digital signal can be converted into other forms;
for example a linear PCM digital signal carrying embedded digital data or a
synchronisation signal may be converted to analogue form and back again and provided
the conversion is faithful, the data may be recovered, or at least the synchronisation
signal may be identified.


The method may further include extracting the auxiliary data and coding the main data. 
At least one coding parameter or decision is preferably based on the auxiliary data. 

Preferably coding includes quantising data words corresponding to said main digital
data stream or, more preferably, a transformed data stream to a plurality of levels less
than the number of levels codable by said data words. The component of the main data
stream may correspond to less significant bits of coded data words which are to be
quantised by said coding to one of a predetermined number of levels, the number of
levels being less than the number of levels encodable by the data words. For example,
if an n-bit word is to be quantised by coding to 2^m levels, where m<n, n-m bits may be
available to carry additional data.

Preferably, the change in the data stream effected by insertion of the auxiliary data is
substantially imperceptible, for example below (or at) the audible noise floor in the case
of audio data or having substantially no perceptible effect on picture quality in the case
of a video signal.

Preferably inserting the auxiliary data comprises inserting the data into unused
sub-band samples of a transformed set of data.

In a preferred application, the main data comprises audio data to be coded according to
an MPEG-type audio coding scheme (by which is meant any similar coding scheme
based on the principle of quantising a plurality of sub bands or other components into
which the signal is analysed) and identifying at least one component comprises
identifying sub-bands which are unoccupied or identifying quantisation levels, the
auxiliary data being inserted in unoccupied bands or at a level below the quantisation
noise floor.

This may be provided in a related but independent aspect, in which the
invention provides a method of inserting auxiliary data into an audio data stream to be
coded by analysing the audio data into a plurality of sub-bands and quantising the
sub-bands, the method comprising estimating sub-bands and quantisation levels for a
subsequent or previous coding and inserting the auxiliary data at a level substantially
below the level of estimated quantisation noise.

Estimating sub-bands and quantisation levels may include transforming the (audio) data
from the time domain (or an uncoded domain) to the frequency domain (or a coded
domain) or otherwise analysing the data into a plurality of subbands, for example using
a Fourier or the like transform. Data may be inserted in the frequency domain, and the
modified frequency domain data may be transformed back to the time domain.

A particular advantage arises when the estimated sub bands or quantisation levels
correspond directly to sub bands or quantisation parameters which have been or will be
used in coding of the data; there is substantially no effect on the coded signal, as the
component(s) of the main data signal which are used to carry the auxiliary data would
otherwise be lost by the coding process.

The data to be carried may comprise a defined synchronisation sequence; this may
facilitate detection of frame boundaries and the like and may be employed to facilitate
extraction of other data or to minimise degradation between cascaded coding and
decoding operations.

The auxiliary data or synchronisation signal may be inserted into an upper subband of
the main data.

In a further aspect, the invention provides a method of carrying a synchronisation
sequence with a main digital data signal, preferably an audio signal, for example a
linear PCM audio signal, comprising inserting a defined sequence of synchronisation
words into a component of the main data signal, preferably an unused subband, to
facilitate identification of or synchronisation with previous coding of the signal.

The invention also provides a method of detecting a frame boundary or establishing
synchronisation with a data signal produced by the above method, comprising searching
for a sequence of synchronisation words in said component of the data signal and
comparing at least one value found, or a derived value, to a stored sequence of values.

The invention further provides a digital data signal, preferably a linear PCM audio
bitstream, comprising an audio signal and at least one of a synchronisation sequence or
an auxiliary data signal embedded in an otherwise unused subband or in subbands
below an MPEG quantisation noise floor.

The invention extends to apparatus for inserting auxiliary data into a data stream and to 
data streams coded by the above method. 

Embodiments of the invention will now be described by way of example, with reference
to the accompanying drawings in which:

Fig. 1 shows schematically cascaded MPEG-type coding and decoding
transformations;

Fig. 2 shows bit allocation for a typical signal; 

Fig. 3 shows scalefactors and the lowest level that can be coded for the signal of
Fig. 2;

Fig. 4 shows space determined to be available for data transmission in 
accordance with the invention; 

Fig. 5 is an illustration of the effect of 32-sample alignment on an ID sequence;

Fig. 6 shows an example synchronisation signal; 

Fig. 7 shows insertion and extraction of the synchronisation signal. 


A preferred application of the invention involves carrying additional data with an audio 
signal which is to be coded according to MPEG audio coding. The basic principles will 
be described, to assist in understanding of the invention. 

Carrying data with MPEG audio signals - Basic Principles

MPEG audio uses the idea of psychoacoustic masking to reduce the amount of
information to be transmitted to represent an audio signal. The reduced information is
represented as a bitstream. Psychoacoustic masking is usually calculated on a
frequency representation of an audio signal. In MPEG audio a filterbank is used to split
the audio into 32 subbands, each representing part of the spectrum of the signal.

The encoder uses a psychoacoustical model to calculate the number of bits needed to
code each of these subbands such that the quantisation noise inserted is not audible.
So, in each subband, only the most significant bits are transmitted.


In this embodiment, the aim is to carry data along with audio in a linear digital PCM form
(although other digital formats may be employed). The data should be carried inaudibly
and be capable of being fully recovered without data loss. We have found that,
depending on the bit-rate used for the MPEG encoding and the nature of the signal, it is
possible to transmit between 50 and 400 kbits/sec of data under a stereo audio signal.

General applications of data-carrying possible with the embodiment include carrying
associated data with the audio, such as text (e.g. lyrics). In addition, a specific use of
the invention, to be described in more detail below, arises if a signal is already in MPEG
coded form or has been previously coded but needs to be conveyed in linear form; here
the extra data can contain details of the coding process or synchronisation information
to assist in subsequent re-coding, or pictures associated with the audio.

The filterbanks in MPEG audio have the property of (nearly) perfect reconstruction. A
diagram of a decoder feeding an encoder is shown in Fig. 1. If the filterbanks (102, 104) are
aligned correctly then the subband samples (106) in the encoder will be practically
identical to those (108) that originated in the decoder.

When an encoder encodes the signal it attempts to allocate enough bits for each
subband such that the resulting signal is not audibly different from the original.

Selection of components for carrying data 

Given these two properties, we have appreciated that data can be inserted into the
subbands below the level of the significant audio signal such that the inserted data is
inaudible (or at least not introducing any impairments beyond those of the MPEG
encoding).

Fig. 2 shows the measured level (202) of the audio in each subband, coded as
"scalefactors" in the MPEG audio bitstream. It also shows the bit allocation (204)
chosen by an encoder. This is specified as the number of quantisation levels for a
particular subband. In the diagram, the bit allocation is represented as a signal-to-noise
ratio, in dB terms, to permit representation on the same axis. For this purpose, each bit
that is needed to represent the number of quantisation levels is approximately
equivalent to 6dB of "level".

If instead we show the scalefactors (302) and the lowest level that can be encoded
(304) with the bit allocation from Fig. 2 we get the graph in Fig. 3.

One can see that the levels below the lowest level are unused. As the MPEG model has
determined that there is no audible information below these lowest levels we are free to
use them for data.

Given the constraint that we should not interfere with the audio, levels near that of the
lowest level will not be used. This should also mean that no clipping problems are
introduced. Given also that the signal is probably to be transmitted or stored over a
linear medium with limited resolution (e.g. 16 bits), this imposes a constraint on the
lowest level we can send. Due to inaccuracies in reconstruction because of truncation to
PCM and limits on accuracy in the filterbank calculation, it is unwise to use the levels
closest to the PCM quantisation limit (e.g. the 16th bit). In the case of subbands where
no information is to be sent two strategies are available.

If we are decoding an MPEG bitstream to insert data, we would not know the level of
that subband so, to be safe, we should probably not send any data in that subband. If,
on the other hand, we are using an encoder purely for generating data we could use the
levels just below the full level in this subband. A diagram showing the area where the
data could be inserted (402), for the latter case, is shown in Fig. 4.

In the case of subbands containing an audio signal, the level of the data will be below
the most significant levels. Data could also be inserted into other subbands, below the
level of audibility or above the range of normal hearing (e.g. in the subbands not used in
MPEG encoding).

Practical Implementation Details 
For a practical implementation several issues need to be addressed, in particular how
the data is inserted and how the data is recovered. Data could be inserted when 
decoding an MPEG audio bitstream or the functions of an encoder and decoder could 
be combined to filter the signal, analyse it, quantise the audio appropriately, insert the 
data, then convert the signal back to the PCM domain. 


Data insertion 

A proposed method of data insertion is first to calculate the number of bits available and
then mask subband values with the data before they are fed to the synthesis filterbank.
A 16-bit system is assumed, but the calculations are similar for a larger number of bits.
The scheme described below is simple and safe.

Calculation of the bits available 

Take the maximum scalefactor for a subband as representing a maximum value signal
that can be conveyed in a 16-bit PCM system. Then consider that approximately 96dB
below this is the quantisation floor of the 16-bit PCM system. Scalefactors are defined in
2dB steps. Once the scalefactor for a given subband is calculated determine the
difference between this and the noise floor in dB (the range, R). The MPEG
psychoacoustic model will give the bit allocation. Translate the bit allocation for the
subband to a signal-to-noise figure in dB (Q). Thus calculate the range in dB available
for data (D) from the quantisation floor to the lowest level represented.

D = R - Q

Then subtract the safety margins of 1-bit near the signal and another bit near the noise
floor, remembering 1-bit is approximately equivalent to 6dB signal-to-noise.

D = D - 12

Next allocate a number of data bits (N) per subband by finding the integer number of
bits that can be represented within D by doing an integer division on D.

N = int(D / 6)

This value is valid for a particular subband and scalefactor. In MPEG Layer 2 there are
up to 3 different scalefactors per frame so each could have its own number of bits or the
minimum could be taken of all 3 scalefactors.
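
By way of illustration only, the calculation of N can be sketched in Python as follows. The sketch assumes that the scalefactor is expressed in dB relative to the maximum 16-bit level (0 dB) and that the bit allocation is supplied as a whole number of bits; the function and constant names are hypothetical and are not taken from the MPEG standard.

```python
PCM_FLOOR_DB = -96.0  # approximate quantisation floor of 16-bit PCM (assumed 0 dB = full scale)
DB_PER_BIT = 6.0      # one bit is roughly 6 dB of signal-to-noise
SAFETY_DB = 12.0      # one guard bit near the signal plus one near the noise floor


def data_bits_available(scalefactor_db, allocated_bits):
    """Return N, the number of data bits that fit beneath the coded audio."""
    r = scalefactor_db - PCM_FLOOR_DB    # range R from the noise floor up to the scalefactor
    q = allocated_bits * DB_PER_BIT      # bit allocation expressed as a signal-to-noise figure Q
    d = r - q - SAFETY_DB                # D = R - Q, then D = D - 12
    return max(0, int(d // DB_PER_BIT))  # N = int(D / 6), clamped so it is never negative


def frame_data_bits(scalefactors_db, allocated_bits):
    """Take the minimum N over the (up to 3) scalefactors of a Layer 2 frame."""
    return min(data_bits_available(sf, allocated_bits) for sf in scalefactors_db)
```

For a subband whose scalefactor lies 10 dB below full scale and which is allocated 8 bits, R is 86 dB, Q is 48 dB and D is 26 dB after the safety margins, so 4 data bits would be available under these assumptions.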

Masking the data onto the subband value 

From the procedure described above the number of bits available (N) is used to create
a mask (M).

M = 0xffff << (N + 1) for a 16-bit system

The subband value is then converted to a 16-bit integer, masked with this value
and the data inserted onto the N Least Significant Bits (excluding the last bit of course)
to give a sample S. To ensure the most accurate representation of the signal a rounding
value is added to S, +0.5 if the signal is positive and -0.5 if it is negative. This gives
almost perfect reconstruction in the analysis filter and the data is recovered perfectly.

An easy method of inserting the data is to treat the data as a bitstream and insert as
many bits into each subband as possible. However, to indicate synchronisation it would
be useful to put a sequence into consecutive (in time) values of subband values so that
a whole frame can be identified.
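
The masking step may be sketched as follows, purely as an illustration. The sketch assumes that the subband value has already been scaled to the signed 16-bit range, that "excluding the last bit" means bit 0 is left clear as a guard bit so that the data occupies bits 1 to N, and that the receiver truncates towards zero before reading the bits. The names `embed` and `extract` are hypothetical, and the filterbanks themselves are not modelled.

```python
def embed(value, data, n):
    """Place the n-bit integer `data` beneath a subband value in 16-bit range."""
    if n <= 0:
        return value
    if not 0 <= data < (1 << n):
        raise ValueError("data does not fit in n bits")
    mask = (0xFFFF << (n + 1)) & 0xFFFF  # M: clears bits 0..n of a 16-bit word
    word = int(value) & 0xFFFF           # 16-bit two's complement form of the value
    word = (word & mask) | (data << 1)   # data in bits 1..n, bit 0 left as a guard
    signed = word - 0x10000 if word & 0x8000 else word
    # Rounding value: +0.5 for a positive sample, -0.5 for a negative one.
    return signed + (0.5 if signed >= 0 else -0.5)


def extract(sample, n):
    """Recover the n data bits from a sample produced by embed()."""
    word = int(sample) & 0xFFFF          # int() truncates towards zero
    return (word >> 1) & ((1 << n) - 1)
```

With N = 3, embedding the bits 101 beneath the value 1234.7 yields 1242.5 in this sketch, and `extract` returns the same three bits for both positive and negative subband values.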


Data Extraction 

To extract the data from the signal, alignment of the filterbanks and a method of 
describing where the data is (the bit allocation) and how it is organised are needed. 
These points are addressed below. 


Synchronisation 

To extract the data, synchronisation with the 32-sample and frame structure of the audio
signal is needed. A separate synchronisation signal could be sent or this signal could
be included in the data sent. Another possibility is to deduce the 32-sample boundary
and then use a synchronisation word within the data to identify the frame boundary. This
aspect is discussed further below.

Bit allocation 

To extract the data, the position of the data within the subbands must be known. There
are several options for how this information is conveyed:

The bit allocation could be implicit by having the same psychoacoustic model in
the receiver of the data as in the transmitter.

The bit allocation could be signalled separately, e.g. in an upper unused
subband, in the user bits of an AES/EBU bitstream or by another technique that
does not interfere with the system described above.

The bit allocation can be contained within the space for data, with mechanisms
provided to signal the location of the bit allocation.

This last option is discussed below.

Data organisation

If the bit allocation is known then the data can be carried in whatever form is suitable for
that particular data. A checksum is advisable as well as a synchronisation word to
define the start of the frame and/or data. If the bit allocation is to be carried within the
data then the dynamic nature of the bit allocation must be taken into account.

An example layout for MPEG Layer 2 audio, using only 1 bit allocation per frame (i.e.
not taking into account the 3 possibly different scalefactors) will be discussed.

A synchronisation word is needed to show where the frame starts. This needs to be
followed by the bit allocations for each subband, preferably with a checksum and then
followed by the data itself, again preferably with a checksum. The synchronisation word
should be followed by a pointer to the space where the bit allocation is contained. Due
to the dynamic nature of the bit allocation, the following manner of organisation would
be appropriate, with the information preferably appearing in the order listed (details may
change):


Synchronisation word 

This should ideally be placed in the lowest subband with data space available,
usually the first subband. The sequence may be placed 1 bit at a time into
consecutive (in time) subband values, in the lowest bit available for data
transmission. The data receiver may have to search for this word if the sync word
is not placed in the first subband. There are a minimum of 36 bits available in a
subband per frame and, for example, 18 bits can be used for the sync word.






Pointer to bit allocation 

This should point to subbands that have data space available to store the bit 
allocation. Assuming we use 4 bits per subband to describe the bit-allocation for 
that subband, with 32 subbands we need 128 bits in total. So, given that we have 
multiples of 36 bits available per subband per frame, we need to be able to point 
to areas containing 4 times 36 bits. Given that there are 18 bits available in the 
synchronisation subband, one possibility is to use a 4-bit pointer to a subband 
and a 2-bit count of the number of bits available. The 4-bit pointer can indicate an 
offset upwards to the next subband (with the range 1 to 16). The 2-bit count can 
be from 1 to 4 bits, as 4 is the maximum number we need. We could then have 
three of these pointers in the first subband. An exception case could be defined if 
we only have subbands with 1 bit available. 

Bit allocation 

This should contain 32 times 4-bits to indicate the number of bits available per 
subband. It should ideally be followed by a 16-bit checksum to ensure the data is 
correct, making a total of 144 bits. 

The data can then follow the above header information. 

The above scheme has an overhead of 180 bits per frame, which is approximately 6900 
bits per second per audio channel at 44.1 kHz. 
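
The overhead figure can be checked with a short calculation, given here only as a sketch. It assumes that the 180 bits are made up of the 18-bit synchronisation word, three 6-bit pointers (a 4-bit offset plus a 2-bit count each) and the 144-bit allocation block with its checksum; this breakdown is inferred from the layout described above, and the names used are hypothetical.

```python
SYNC_BITS = 18                 # example synchronisation word length
POINTER_BITS = 3 * (4 + 2)     # three pointers: 4-bit offset plus 2-bit count (assumed)
ALLOCATION_BITS = 32 * 4 + 16  # 4 bits for each of 32 subbands plus a 16-bit checksum
FRAME_SAMPLES = 1152           # PCM samples per MPEG Layer 2 frame


def header_overhead(sample_rate_hz):
    """Return (header bits per frame, header bits per second) for one channel."""
    bits_per_frame = SYNC_BITS + POINTER_BITS + ALLOCATION_BITS
    frames_per_second = sample_rate_hz / FRAME_SAMPLES
    return bits_per_frame, bits_per_frame * frames_per_second


bits, rate = header_overhead(44100)
print(bits, round(rate))
```

At 44.1 kHz there are about 38.3 frames per second, so 180 bits per frame corresponds to roughly 6890 bits per second, consistent with the approximate figure quoted above.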

The implementation described above is suitable for carrying whatever data is desired, 
for example lyrics, graphics or other additional information. Another possibility is, 
particularly where the data has been previously coded, to carry information on previous 
coding decisions, for example to reduce impairment in signal quality caused by 
cascaded decoding and recoding, or to simplify subsequent coding.

A further possibility is to carry a synchronisation signal or data word (in addition to
further data or alone) either to assist in establishing synchronisation (as mentioned
above) or to facilitate recoding of a previously coded signal by deducing previous coding
decisions. An arrangement for carrying a synchronisation signal will now be described.


Carrying a synchronisation signal 

The technique to be described below enables deduction of synchronisation from the
characteristics of the signal itself, rather than added data. It is also capable of surviving
a level change. To assist in understanding, the basic principles of MPEG audio,
discussed above, will be summarised again, with reference to this specific
implementation.

Synchronisation with MPEG-type audio - Basic Principles 

MPEG audio uses a filter to split the audio into different subbands. The PCM input
samples are transformed into corresponding subband samples by an analysis filter.
These samples are then transformed back into PCM samples by a synthesis filter.
There is an inherent delay in this process, dependent on the design of the filterbanks.

For each 32 input PCM samples the analysis filter produces 32 values, one for each
subband. This group of subband values is known as a "subband sample". In MPEG
audio a fixed number of PCM samples, a frame, are grouped together to make the
coding more efficient. MPEG Layer 2, for example, uses a frame length of 1152 PCM
samples, which is equivalent to 36 subband samples. Information is then carried in the
MPEG bitstream about this whole frame, e.g. the number of bits per subband and the
level of each subband as well as the quantised subband values.



The nature of the filterbank is such that when re-encoding a previously encoded signal,
the original subband samples will only be recovered if the PCM samples going into the
analysis filterbank line up to the same 32-sample boundary as used in the original
encoding. If the filterbank 32-sample boundaries are not aligned extra noise will appear
in the subbands.

In order to code the audio again optimally it would be useful to know where the
32-sample boundary is, to avoid inserting extra noise. It would also be useful to know
where the frame boundary is, so that calculations of the appropriate bit-allocation are
based on exactly the same signal. In theory this could lead to transparent re-encoding.

In this application of the invention, the aim is to insert a specific identification sequence
into a subband in a decoder, which will then be embedded in the linear PCM output. A
subsequent encoder can use this information to deduce the 32-sample boundaries in
the original encoding and/or to deduce the frame boundary upon which the original
encoding was based.

An advantage of the technique now being described is that deduction is direct from
performing a filterbank on the audio. By inserting this identification sequence into an
upper subband, the signal will be inaudible and continually present. It could alternatively
be inserted into a lower subband, on its own as an identification signal or carried
underneath the audio signal. A suitable identification signal could still be decoded after
a level change.

Inserting identification sequence 

By inserting a suitable identification sequence into a subband, the original values of this
sequence will only be recovered exactly when the original 32-sample boundary of the
initial analysis filter is matched in the current analysis filterbank. Thus if the PCM audio
is offset by something other than 32 samples another unique sequence will be
produced. From this the original 32-sample boundaries can be determined. If the
sequence is unique across the length of a frame (e.g. 1152 PCM samples for Layer 2,
equivalent to 36 consecutive values in 1 particular subband), the frame position can
also be easily deduced. An illustrative sequence is shown in Fig. 5.

If a gain change is applied to the PCM audio signal, only the relative levels of the
identification sequence will be changed. Thus the same information could still be
deduced, dependent on the inserted level of the identification sequence. By careful
choice of a suitable identification sequence the frame position can be calculated with
only a subset of its 36 samples. The sequence preferably comprises at least 4 words.

Example identification sequence

An example synchronisation sequence (602), shown in Fig. 6, consists of a sine wave
with certain points set to zero. This can be inserted into an upper subband, e.g.
subband 30. For 48kHz sampling this is above the maximum subband (27) defined by
the MPEG standard. Thus this extra synchronisation signal would not be coded by a
"dumb" encoder.

This sequence (700 of Fig. 7) should be inserted into an appropriate subband before the
synthesis filter (702 of Fig. 7). The analysis filter would then produce
subband samples from which the frame and 32-sample boundary can be deduced.

To analyse the offset the modified encoder can use the following simple procedure
(assuming it has no synchronisation information at the moment):

Move in the next 32 PCM samples and run the filterbank to obtain a subband
sample.

Extract the value from the appropriate subband (e.g. 30).

Check this value against a table of all known possible values for all offsets. (A
table of 32 by 36 values.)

If a match has been found, run the filterbank again a couple of times and check
the consecutive values in the table.

Derive the exact sample offset required from the position in the table.

When the filterbank is run again with the correct offset, the alignment can be
double-checked very easily.
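
The table search in the procedure above may be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: `table[offset][position]` is taken to hold the value expected in the synchronisation subband for each of the 32 sample offsets and 36 frame positions, `observed` holds the values obtained from consecutive runs of the filterbank, and the function name `locate` and the tolerance are hypothetical. The filterbank itself is not modelled.

```python
def locate(table, observed, tol=1e-3):
    """Return the (offset, position) whose table entries match `observed`.

    Returns None when no entry, or more than one entry, is consistent
    with every observed value.
    """
    candidates = [(o, p) for o, row in enumerate(table) for p in range(len(row))]
    for k, value in enumerate(observed):
        # Keep only candidates whose k-th following table value also matches,
        # wrapping round at the end of the frame.
        candidates = [
            (o, p) for (o, p) in candidates
            if abs(table[o][(p + k) % len(table[o])] - value) <= tol
        ]
    return candidates[0] if len(candidates) == 1 else None
```

Each additional subband sample narrows the list of candidates, which corresponds to running the filterbank again and checking the consecutive values in the table.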

If the synchronisation signal is defined carefully to give unique values for all the offsets
and positions the number of comparisons can be kept to a minimum. The
synchronisation signal defined above would give a definite answer after running the
filterbank 4 times, i.e. with just 4 subband samples. It is possible to define other
synchronisation signals which would indicate the delay directly, but there is a trade-off in
how much processing power is required to perform the filterbank against the time
required for searching tables and deriving values.

A procedure for determining synchronisation when gain has been applied to the signal
is similar in principle to the above, but the relative levels of consecutive samples should
be used. E.g. if the subband values are A, B, C, ... then a table of A/B, B/C, ... would be
used. This may impose further requirements on the synchronisation signal. The above
signal could also indicate if there had been a phase inversion of the audio.
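
The use of relative levels can be illustrated by the following sketch, which compares the ratios of consecutive values rather than the values themselves. The function names and the tolerance are hypothetical, and the sketch assumes that none of the values used as a denominator is zero.

```python
import math


def consecutive_ratios(values):
    """Return A/B, B/C, ... for consecutive subband values A, B, C, ..."""
    if any(v == 0 for v in values[1:]):
        raise ValueError("a zero value cannot be used as a denominator")
    return [a / b for a, b in zip(values, values[1:])]


def same_pattern(observed, reference, rel_tol=1e-6):
    """True if `observed` matches `reference` up to a constant gain."""
    got = consecutive_ratios(observed)
    want = consecutive_ratios(reference)
    return len(got) == len(want) and all(
        math.isclose(g, w, rel_tol=rel_tol) for g, w in zip(got, want)
    )
```

A constant gain cancels in each ratio, so the comparison is unaffected by a level change. Points set to zero cannot serve as denominators, which may be one of the further requirements on the synchronisation signal referred to above.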

To recap, techniques have been described for carrying data "transparently" in a data
stream in a manner which is compatible with subsequent or previous coding, particularly
MPEG-type audio coding. Techniques for establishing synchronisation with a previously
coded signal have also been described. The invention may be extended to other
applications and the preferred features mentioned above may be provided
independently unless otherwise stated.

Claims

1. A method of inserting auxiliary digital data in a main digital data stream which main 
digital data stream is subsequently to be coded according to a defined coding scheme 
to produce a coded data stream or which main digital data stream has previously been 
coded according to a defined coding scheme to produce a coded data stream and 
decoded, the method comprising identifying at least one component of the main digital
data stream which will make substantially no perceptible contribution to the 
subsequently coded data stream or which made substantially no perceptible 
contribution to the previously coded data stream and inserting data from the auxiliary 
data stream in the or each component to produce an output data stream carrying the 
auxiliary data. 

2. A method according to Claim 1 wherein the main data comprises audio data to be 
coded according to an MPEG-type audio coding scheme and identifying at least one 
component comprises estimating sub-bands which are unoccupied or estimating 
quantisation levels, the auxiliary data being inserted in unoccupied subbands or at a 
level below (or at) the quantisation noise floor. 

3. A method of inserting auxiliary data into an audio data stream which audio data 
stream is subsequently to be coded according to a defined coding scheme by analysing 
the audio data into a plurality of sub-bands and quantising the sub-bands or which audio 
data stream has previously been coded according to said defined coding scheme and 
decoded, the method comprising estimating sub-bands and quantisation levels for a 
subsequent or previous coding and inserting the auxiliary data at a level substantially 
below the level of estimated quantisation noise. 

4. A method according to Claim 1, further comprising coding the 
output data stream. 
5. A method according to Claim 4, comprising adjusting or selecting at least one 
parameter or decision associated with said coding in dependence on data from the 
auxiliary data stream. 

6. A method according to Claim 4 or 5 wherein the auxiliary data is extracted prior to 
or during said coding. 

7. A method according to Claim 1 wherein coding includes 
quantising data words corresponding to said main digital data stream, or a transformed 
version thereof, to a plurality of levels less than the number of levels codable by said 
data words. 

8. A method according to Claim 2 or 3 wherein 
estimating sub-bands and quantisation levels includes transforming the audio data from 
the time domain to the frequency domain. 

9. A method according to Claim 8 wherein the auxiliary data is inserted in the 
frequency domain to produce modified frequency domain data, and the modified 
frequency domain data is transformed back to the time domain. 

10. A method according to Claim 1, including decoding a 
previously coded data stream to generate said main digital data stream, wherein 
identifying the or each component or estimating sub-bands and quantisation levels is 
based on information concerning the previous coding. 

11. A method according to Claim 1 wherein the auxiliary data is 
used to establish synchronisation with or to maintain consistency with a previous coding 
of the main data stream. 
12. A method according to Claim 1 wherein the auxiliary data to 
be carried includes a defined synchronisation sequence. 

13. A method according to Claim 1 wherein the main digital data 
stream has at least one upper subband and wherein the auxiliary data or 
synchronisation signal is inserted into said at least one upper subband. 

14. A method of carrying a synchronisation sequence with a digital audio signal 
which digital audio signal has previously been coded according to a defined coding 
scheme, the method comprising inserting a defined sequence of synchronisation words 
into a component of the digital audio signal to facilitate identification of or 
synchronisation with previous coding of the signal, the component being chosen so that 
the sequence is substantially imperceptible. 

15. A method according to Claim 12 or 14 wherein the sequence comprises at least 4 
words. 

16. A method of detecting a frame boundary or establishing synchronisation with a 
data signal produced by Claim 14 comprising searching for a 
sequence of synchronisation words in said component of the data signal and comparing 
at least one value found, or a value derived therefrom, to a stored sequence of values. 

17. A method according to Claim 1, wherein the auxiliary data or 
the synchronisation sequence is inserted at a decoder which generates the main digital 
data signal/the audio data stream/the digital audio signal from a previously coded 
signal. 
18. A digital data stream produced by a method according to Claim 1. 

19. An uncoded digital data stream, preferably a linear PCM audio bitstream, 
comprising an audio signal and at least one of a synchronisation sequence or an 
auxiliary data signal embedded in an otherwise unused subband or in subbands below 
an MPEG quantisation noise floor of a coding process. 

20. Apparatus for inserting auxiliary data into a data stream comprising: 

an input module for receiving a main digital data stream which main digital data 
stream is subsequently to be coded according to a defined coding scheme to produce a 
coded data stream or which main digital data stream has previously been coded 
according to a defined coding scheme to produce a coded data stream and decoded; 

a selection module for identifying at least one component of the main data stream 
which will make substantially no perceptible contribution to the subsequently coded data 
stream or which made substantially no perceptible contribution to the previously coded 
data stream; and 

an insertion module for inserting auxiliary data in the or each component to 
produce an output data stream carrying the auxiliary data. 

21. Apparatus according to Claim 20 wherein the selection module 
comprises an estimator for estimating sub-bands which are unoccupied or 
an estimator for estimating quantisation levels of an MPEG-type audio coding 
process. 

22. Apparatus according to Claim 20 wherein one or more of said input module, 
selection module and insertion module are implemented at least partially in software by 
a processor. 
23. Apparatus for inserting auxiliary data into an audio data stream which audio data 
stream is subsequently to be coded according to a defined coding scheme by analysing 
the audio data into a plurality of sub-bands and quantising the sub-bands or which audio 
data stream has previously been coded according to said defined coding scheme and 
decoded, the apparatus comprising: 

an estimation module for estimating sub-bands and quantisation levels for a 
subsequent or previous coding; and 

an insertion module for inserting the auxiliary data at a level substantially below 
the level of estimated quantisation noise. 

24. Apparatus according to Claim 21 or 22 wherein the estimation 
module includes a transform module for 
transforming the audio data from the time domain to the frequency domain. 

25. Apparatus according to Claim 24 including a modification module for 
inserting the auxiliary data in the frequency domain to produce modified frequency 
domain data and a reverse transform module for transforming the modified 
frequency domain data back to the time domain. 

26. Apparatus according to Claims 20 to 24 comprising a decoder for decoding a 
previously coded data stream to generate said main digital data stream or said audio 
data stream. 

27. Apparatus according to Claim 26, wherein the estimation module is 
arranged to use information concerning the previous coding. 

28. Apparatus according to Claim 26 arranged to insert auxiliary data for use in 
establishing synchronisation with or maintaining consistency with a previous coding of 
the main data stream. 
29. Apparatus according to any of Claims 20 to 27 arranged to insert a defined 
synchronisation sequence as at least part of the auxiliary data. 

30. Apparatus according to any of Claims 20 to 28 arranged to insert the auxiliary 
data or synchronisation signal into an upper subband of the main digital data stream. 

31. Apparatus for processing a digital audio signal which digital audio 
signal has previously been coded according to a defined coding scheme, the apparatus 
comprising means for inserting a synchronisation sequence comprising a defined 
sequence of synchronisation words into a component of the digital audio signal to 
facilitate identification of or synchronisation with previous coding of the signal, wherein 
the component is chosen so that the inserted data will be substantially imperceptible. 

32. Apparatus according to Claim 28 or 31 wherein the sequence comprises at 
least 4 words. 

33. Apparatus according to any of Claims 20 to 31, further comprising a coder 
for coding the output data stream. 

34. A coder for coding a digital data stream produced by a method according to 
Claims 1 to 17 arranged to extract 
said auxiliary data prior to or as part of coding the signal. 

35. A coder according to Claim 34 including means for adjusting or selecting at 
least one parameter or decision associated with coding in dependence on data from the 
auxiliary data stream. 

36. Apparatus for detecting a frame boundary or establishing synchronisation with a 
data signal produced by a method according to any of Claims 12, 14 or 15 comprising 
means for searching for a sequence of synchronisation words in said component of the 
data signal and comparing at least one value found, or a value derived therefrom, to a 
stored sequence of values. 
Abstract of the disclosure 

Auxiliary digital data are inserted into a main digital data stream, to be 
subsequently coded to produce a coded data stream, by identifying a component of the 
main data stream that will make substantially no contribution to the coded data stream. 
It is into this component that data from the auxiliary data stream is inserted. The main 
digital data stream may comprise MPEG coded audio data, and the component (which 
represents unoccupied sub-bands or is at a level at or below a quantization noise 
floor) may be identified by estimating sub-bands that are unoccupied, or estimating 
quantization levels. 